National Repository of Grey Literature (Národní úložiště šedé literatury)
Penetration Tests of a Speaker Verification System (Penetrační testy systému pro verifikaci řečníka)
Nguyen, QuangTrang ; Rohdin, Johan Andréas (reviewer) ; Plchot, Oldřich (supervisor)
The aim of this bachelor's thesis is to design a set of penetration tests for speaker verification using speech synthesis and available recordings of the target speakers. The thesis covers the study of speech synthesis, speaker verification, and the spoofing methods one may encounter. Before the test suite itself is designed, the system used in this work and its components are described. The final chapters describe the design of the test suites and how the tests were carried out. The conclusion evaluates the results and answers the question of whether a speaker verification system can be broken using a speech synthesis method.
Audio signal modelling using neural networks
Pešán, Michele ; Ištvánek, Matěj (reviewer) ; Miklánek, Štěpán (supervisor)
Neural networks based on the WaveNet architecture and recurrent neural networks are now used in human speech synthesis and in other tasks such as "black-box" modeling of systems that alter acoustic signals (modulation effects, non-linear distortion units, etc.). This work aims to summarize existing methods of using neural networks in acoustic signal modeling. Next, the student is to implement a chosen neural network model in Python and train it to simulate a desired sound effect or acoustic alteration system. The task for this semester is to summarize existing knowledge concerning neural networks; a training database of sound samples and an implementation of a sound-modeling neural network are to be created as well. In recent years, neural networks have been used more and more extensively across many scientific fields. This academic work provides a brief introduction to neural network terminology and common practice and elaborates on several types of neural networks, with the main focus on DeepMind's WaveNet. Furthermore, it describes and compares the results of experimental implementations of WaveNet and other types of neural networks in audio signal "black-box" modeling tasks.
Signal-Level Music Modeling Using WaveNet (Modelování hudby na úrovni signálu pomocí WaveNetu)
Slanináková, Terézia ; Landini, Federico Nicolás (reviewer) ; Beneš, Karel (supervisor)
This thesis investigates the possibility of modeling music and speech with WaveNet, a deep neural network for generating audio at the signal level. Using existing implementations, WaveNet was trained on various datasets and produced many audio files. Several experiments were carried out with different settings of WaveNet's hyperparameters. Several generation schemes were also used, each with a different effect on the generated output. The quality of the output audio files was evaluated by means of a questionnaire. The music tracks achieved scores of 2 to 3.1818 on a 5-point scale, which is comparable to the music recordings of the original research team (3.1818).
Non-Parallel Voice Conversion
Brukner, Jan ; Plchot, Oldřich (reviewer) ; Černocký, Jan (supervisor)
Voice conversion (VC) aims to convert the voice of a source speaker into the voice of a target speaker. It is popular in humorous Internet videos but also has a range of serious use cases, such as dubbing audiovisual material and anonymizing voices (for example, for witness protection). Because it can be used to spoof voice identification systems, it is also an important tool for developing spoofing detectors and countermeasures. VC models have mainly been trained on parallel audio (i.e., two speakers uttering the same text) and on high-quality audio material. The goal of this thesis was to investigate developing VC on non-parallel data and with low-quality signals, mainly from the publicly available VoxCeleb dataset. This work follows the state-of-the-art AutoVC architecture defined by Qian et al. It is based on neural network (NN) autoencoders that aim to separate speech into content-dependent and speaker-dependent embeddings. The target speech is then obtained by replacing the source speaker embedding with that of the target speaker. We have improved Qian's architecture to process low-quality audio by experimenting with different speaker embeddings (d-vectors vs. x-vectors), introducing a speaker classifier on the content embeddings in an adversarial setup, and tuning the size of the content embeddings, which imposes an information bottleneck on the autoencoder. We have also defined another adversarial architecture that compares the original content embeddings with those obtained after the VC process. The experimental results show that non-parallel VC on low-quality data is indeed feasible. The resulting audio was not as good as that obtained from high-quality data, but the speaker verification results after spoofing by the proposed system clearly showed a shift of voice characteristics toward the target speakers.
